The IBM speech activity detection system for the DARPA RATS program

Authors

  • George Saon
  • Samuel Thomas
  • Hagen Soltau
  • Sriram Ganapathy
  • Brian Kingsbury
Abstract

We present the IBM speech activity detection system that was fielded in the phase 2 evaluation of the DARPA RATS (robust automatic transcription of speech) program. Key ingredients of the system are: multi-pass HMM Viterbi segmentation, fusion of multiple feature streams, file-based and speech-based normalization schemes, the use of regular and convolutional deep neural networks, and model fusion through frame-level score combination of channel-dependent models. These techniques were instrumental in achieving a 1.4% equal error rate on the RATS phase 2 evaluation data.
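Two ideas from the abstract, frame-level score combination and the equal error rate, can be illustrated with a minimal sketch. The abstract does not say how scores are combined or how the evaluation was scored, so the weighted average and the simple per-frame EER below are assumptions. The function names `fuse_frame_scores` and `equal_error_rate` are hypothetical and do not come from the IBM system.

```python
from typing import List, Optional, Sequence, Tuple


def fuse_frame_scores(model_scores: Sequence[Sequence[float]],
                      weights: Optional[Sequence[float]] = None) -> List[float]:
    """Weighted average of per-frame speech scores from several models.

    Assumption: a plain weighted mean; the actual fusion rule is not
    given in the abstract.
    """
    n_models = len(model_scores)
    n_frames = len(model_scores[0])
    if any(len(s) != n_frames for s in model_scores):
        raise ValueError("all models must score the same number of frames")
    if weights is None:
        weights = [1.0 / n_models] * n_models
    return [sum(w * s[t] for w, s in zip(weights, model_scores))
            for t in range(n_frames)]


def equal_error_rate(scores: Sequence[float],
                     labels: Sequence[int]) -> Tuple[float, float]:
    """Return (eer, threshold) at the point where the miss rate and the
    false-alarm rate are closest. labels: 1 = speech, 0 = non-speech.

    Assumption: simplified frame-level scoring, not the official
    RATS evaluation protocol.
    """
    n_speech = sum(labels)
    n_nonspeech = len(labels) - n_speech
    best = None
    for thr in sorted(set(scores)):
        # A frame is called speech when its score is >= thr.
        misses = sum(1 for s, y in zip(scores, labels) if y == 1 and s < thr)
        false_alarms = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= thr)
        p_miss = misses / n_speech
        p_fa = false_alarms / n_nonspeech
        gap = abs(p_miss - p_fa)
        if best is None or gap < best[0]:
            best = (gap, (p_miss + p_fa) / 2.0, thr)
    return best[1], best[2]


# Toy example: two hypothetical channel-dependent models scoring four frames.
model_a = [0.9, 0.8, 0.2, 0.1]
model_b = [0.7, 0.6, 0.4, 0.3]
fused = fuse_frame_scores([model_a, model_b])
eer, threshold = equal_error_rate(fused, [1, 1, 0, 0])
```

In this toy case the fused scores separate speech from non-speech perfectly, so the EER is zero. On real data the two error rates trade off against each other as the threshold moves.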

Similar resources

Developing a Speech Activity Detection System for the DARPA RATS Program

This paper describes the speech activity detection (SAD) system developed by the Patrol team for the first phase of the DARPA RATS (Robust Automatic Transcription of Speech) program, which seeks to advance state-of-the-art detection capabilities on audio from highly degraded communication channels. We present two approaches to SAD, one based on Gaussian mixture models, and one based on multi-la...

Improving the speech activity detection for the DARPA RATS phase-3 evaluation

This paper presents the work that we conducted to build the speech activity detection (SAD) systems for the phase 3 evaluation of the RATS program. The work focused on improving SAD performance with the neural network (NN) approach. The major efforts include reducing false rejection errors by extending speech regions in the training references and using post-processing NNs, an...

Neural network acoustic models for the DARPA RATS program

We present a comparison of acoustic modeling techniques for the DARPA RATS program in the context of spoken term detection (STD) on speech data with severe channel distortions. Our main findings are that both Multi-Layer Perceptrons (MLPs) and Convolutional Neural Networks (CNNs) outperform Gaussian Mixture Models (GMMs) on a very difficult LVCSR task. We discuss pre-training, feature sets and ...

Patrol Team Language Identification System for DARPA RATS P1 Evaluation

This paper describes the language identification (LID) system developed by the Patrol team for the first phase of the DARPA RATS (Robust Automatic Transcription of Speech) program, which seeks to advance state-of-the-art detection capabilities on audio from highly degraded communication channels. We show that techniques originally developed for LID on telephone speech (e.g., for the NIST langua...

A Hybrid Machine Learning Method for Intrusion Detection

Data security is an important area of concern for every computer system owner. An intrusion detection system is a device or software application that monitors a network or systems for malicious activity or policy violations. Already various techniques of artificial intelligence have been used for intrusion detection. The main challenge in this area is the running speed of the available implemen...

Publication date: 2013